Abstract

We analyze linkage strategies for a set X of webpages for which the
webmaster wants to maximize the sum of Google's PageRank scores. 
The webmaster can only choose the hyperlinks starting from the web- 
pages of X and has no control on the hyperlinks from other webpages. 
We provide an optimal linkage strategy under some reasonable assump- 
tions. 
1 Introduction

PageRank, a measure of webpages' relevance introduced by Brin and Page, is
at the heart of the well known search engine Google [6, 15]. Google classifies
the webpages according to the pertinence scores given by PageRank, which 
are computed from the graph structure of the Web. A page with a high 
PageRank will appear among the first items in the list of pages corresponding 
to a particular query. 

If we look at the popularity of Google, it is not surprising that some 
webmasters want to increase the PageRank of their webpages in order to 
get more visits from websurfers to their website. Since PageRank is based 
on the link structure of the Web, it is therefore useful to understand how
the addition or deletion of hyperlinks influences it.

Mathematical analysis of PageRank's sensitivity with respect to pertur- 
bations of the matrix describing the webgraph is a topical subject of interest 
(see for instance [2, ?, 12, 13, ?] and the references therein). Normwise
and componentwise conditioning bounds [11] as well as the derivative [12, 13]
are used to understand the sensitivity of the PageRank vector. It appears
that the PageRank vector is relatively insensitive to small changes in the
graph structure, at least when these changes concern webpages with a low
PageRank score [5, 12]. One could think therefore that trying to modify
one's PageRank via changes in the link structure of the Web is a waste of
time. However, what is important for webmasters is not the values of the
PageRank vector but the ranking that ensues from it. Lempel and Moran [14]
showed that PageRank is not rank-stable, i.e. small modifications in the link
structure of the webgraph may cause dramatic changes in the ranking of the
webpages. Therefore, the question of how the PageRank of a particular page
or set of pages could be increased, even slightly, by adding or removing links
to the webgraph remains of interest.

As is well known [?], if a hyperlink from a page i to a page j is
added, with no other modification in the Web, then the PageRank of j
will increase. But in general, you do not have control on the inlinks of your
webpage unless you pay another webmaster to add a hyperlink from his/her
page to yours or you make an alliance with him/her by trading a link for a
link [3, 8]. But it is natural to ask how you could modify your PageRank by
yourself. This leads us to analyze how the choice of the outlinks of a page can
influence its own PageRank. Sydow [17] showed via numerical simulations
that adding well chosen outlinks to a webpage may increase significantly its 
PageRank ranking. Avrachenkov and Litvak [2] analyzed theoretically the 
possible effect of new outlinks on the PageRank of a page and its neighbors. 
Supposing that a webpage has control only on its outlinks, they gave the 
optimal linkage strategy for this single page. Bianchini et al. [5] as well as 
Avrachenkov and Litvak in [1] consider the impact of links between web
communities (websites or sets of related webpages), respectively on the sum 
of the PageRanks and on the individual PageRank scores of the pages of 
some community. They give general rules in order to have a PageRank as 
high as possible but they do not provide an optimal link structure for a 
website. 

Our aim in this paper is to find a generalization of Avrachenkov-Litvak's 
optimal linkage strategy [2] to the case of a website with several pages. We 
consider a given set of pages and suppose we have only control on the outlinks 
of these pages. We are interested in the problem of maximizing the sum of 
the PageRanks of these pages. 

Let $\mathcal{G} = (\mathcal{N}, \mathcal{E})$ be the webgraph, with a set of nodes
$\mathcal{N} = \{1, \dots, n\}$ and a set of links $\mathcal{E} \subseteq \mathcal{N} \times \mathcal{N}$.
For a subset of nodes $X \subseteq \mathcal{N}$, with complement
$\overline{X} = \mathcal{N} \setminus X$, we define

\begin{align*}
\mathcal{E}_X &= \{(i,j) \in \mathcal{E} : i, j \in X\} && \text{the set of internal links,} \\
\mathcal{E}_{\mathrm{out}(X)} &= \{(i,j) \in \mathcal{E} : i \in X,\ j \notin X\} && \text{the set of external outlinks,} \\
\mathcal{E}_{\mathrm{in}(X)} &= \{(i,j) \in \mathcal{E} : i \notin X,\ j \in X\} && \text{the set of external inlinks,} \\
\mathcal{E}_{\overline{X}} &= \{(i,j) \in \mathcal{E} : i, j \notin X\} && \text{the set of external links.}
\end{align*}

If we do not impose any condition on $\mathcal{E}_X$ and $\mathcal{E}_{\mathrm{out}(X)}$, the problem of
maximizing the sum of the PageRanks of pages of $X$ is quite trivial and does
not have much interest (see the discussion in Section 4). Therefore, when
characterizing optimal link structures, we will make the following accessibility
assumption: every page of the website must have an access to the rest
of the Web.

Our first main result concerns the optimal outlink structure for a given 
website. In the case where the subgraph corresponding to the website is 
strongly connected, Theorem 10 can be particularized as follows.

Theorem. Let $\mathcal{E}_X$, $\mathcal{E}_{\mathrm{in}(X)}$ and $\mathcal{E}_{\overline{X}}$ be given. Suppose that the
subgraph $(X, \mathcal{E}_X)$ is strongly connected and $\mathcal{E}_X \neq \emptyset$. Then every optimal
outlink structure $\mathcal{E}_{\mathrm{out}(X)}$ consists of a single outlink to a particular page
outside of $X$.

We are also interested in the optimal internal link structure for a website. 
In the case where there is a unique leaking node in the website, that is only 
one node linking to the rest of the web, Theorem 11 can be particularized
as follows.

Theorem. Let $\mathcal{E}_{\mathrm{out}(X)}$, $\mathcal{E}_{\mathrm{in}(X)}$ and $\mathcal{E}_{\overline{X}}$ be given. Suppose that there is
only one leaking node in $X$. Then every optimal internal link structure $\mathcal{E}_X$
is composed of a forward chain of links together with every possible
backward link.

Putting together Theorems 10 and 11, we get in Theorem 12 the optimal
link structure for a website. This optimal structure is illustrated in Figure 1.

Theorem. Let $\mathcal{E}_{\mathrm{in}(X)}$ and $\mathcal{E}_{\overline{X}}$ be given. Then, for every optimal link
structure, $\mathcal{E}_X$ is composed of a forward chain of links together with every
possible backward link, and $\mathcal{E}_{\mathrm{out}(X)}$ consists of a unique outlink, starting
from the last node of the chain.







Figure 1: Every optimal linkage strategy for a set X of five pages must 
have this structure. 
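
The structure announced above can be explored by exhaustive search on a very small instance. The following is only a sketch under stated assumptions: the four-node graph (`X`, `OUTSIDE`, `FIXED`), the uniform personalization vector and all function names are hypothetical and not taken from the paper. It enumerates every choice of links leaving the three controlled pages (self-links allowed), keeps the choices in which every controlled page can reach the outside, and retains the one with the largest sum of PageRanks.

```python
C = 0.85                      # damping factor
N = 4
X = [0, 1, 2]                 # pages under the webmaster's control
OUTSIDE = [3]                 # the rest of this toy Web
FIXED = {(3, 0), (3, 3)}      # hypothetical external inlink and external link

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [a[k][:] + [b[k]] for k in range(n)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[k][n] / m[k][k] for k in range(n)]

def pagerank_of_X(links):
    """Sum of the PageRanks of X, computed as (1 - c) z^T (I - cP)^{-1} e_X
    with a uniform personalization vector z."""
    deg = [sum(1 for (i, _) in links if i == k) for k in range(N)]
    a = [[(1.0 if i == j else 0.0) - (C / deg[i] if (i, j) in links else 0.0)
          for j in range(N)] for i in range(N)]
    v = solve(a, [1.0 if k in X else 0.0 for k in range(N)])
    return (1.0 - C) * sum(v) / N

def reaches_outside(links, start):
    """Accessibility check: is there a path from start to a node of OUTSIDE?"""
    seen, stack = {start}, [start]
    while stack:
        i = stack.pop()
        for (a, b) in links:
            if a == i and b not in seen:
                seen.add(b)
                stack.append(b)
    return any(k in seen for k in OUTSIDE)

candidates = [(i, j) for i in X for j in range(N)]   # links we may choose
best_score, best = -1.0, None
for mask in range(1 << len(candidates)):
    chosen = {l for k, l in enumerate(candidates) if (mask >> k) & 1}
    links = FIXED | chosen
    if all(reaches_outside(links, i) for i in X):
        s = pagerank_of_X(links)
        if s > best_score:
            best_score, best = s, chosen

internal = sorted(l for l in best if l[1] in X)
outlinks = sorted(l for l in best if l[1] in OUTSIDE)
print("internal links:", internal)
print("external outlinks:", outlinks)
```

On such a tiny instance the search space has only 4096 configurations, so brute force is affordable; for realistic sizes the structural theorems are what makes the problem tractable.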



This paper is organized as follows. In the following preliminary section, 
we recall some graph concepts as well as the definition of the PageRank, and 
we introduce some notations. In Section 3, we develop tools for analysing the
PageRank of a set of pages X. Then we come to the main part of this paper:
in Section 4, we provide the optimal linkage strategy for a set of nodes. In
Section 5, we give some extensions and variants of the main theorems. We
end this paper with some concluding remarks. 

2 Graphs and PageRank 

Let $\mathcal{G} = (\mathcal{N}, \mathcal{E})$ be a directed graph representing the Web. The webpages
are represented by the set of nodes $\mathcal{N} = \{1, \dots, n\}$ and the hyperlinks are
represented by the set of directed links $\mathcal{E} \subseteq \mathcal{N} \times \mathcal{N}$. That means that
$(i,j) \in \mathcal{E}$ if and only if there exists a hyperlink linking page $i$ to page $j$.

Let us first briefly recall some usual concepts about directed graphs (see
for instance [5]). A link $(i,j)$ is said to be an outlink for node $i$ and an
inlink for node $j$. If $(i,j) \in \mathcal{E}$, node $i$ is called a parent of node $j$. By
\[ j \leftarrow i \]
we mean that $j$ belongs to the set of children of $i$, that is
$j \in \{k \in \mathcal{N} : (i,k) \in \mathcal{E}\}$. The outdegree $d_i$ of a node $i$ is its number of
children, that is
\[ d_i = |\{j \in \mathcal{N} : (i,j) \in \mathcal{E}\}|. \]

A path from $i_0$ to $i_s$ is a sequence of nodes $(i_0, i_1, \dots, i_s)$ such that
$(i_k, i_{k+1}) \in \mathcal{E}$ for every $k = 0, 1, \dots, s-1$. A node $i$ has an access to a node
$j$ if there exists a path from $i$ to $j$. In this paper, we will also say that a
node $i$ has an access to a set $\mathcal{J}$ if $i$ has an access to at least one node
$j \in \mathcal{J}$. The graph $\mathcal{G}$ is strongly connected if every node of $\mathcal{N}$ has an access
to every other node of $\mathcal{N}$. A set of nodes $\mathcal{F} \subseteq \mathcal{N}$ is a final class of the graph
$\mathcal{G} = (\mathcal{N}, \mathcal{E})$ if the subgraph $(\mathcal{F}, \mathcal{E}_{\mathcal{F}})$ is strongly connected and moreover
$\mathcal{E}_{\mathrm{out}(\mathcal{F})} = \emptyset$ (i.e. nodes of $\mathcal{F}$ do not have an access to $\mathcal{N} \setminus \mathcal{F}$).

Let us now briefly introduce the PageRank score (see [5, ?, 12, ?, 15]
for background). Without loss of generality (please refer to the book of
Langville and Meyer [13] or the survey of Bianchini et al. [5] for details),
we can make the assumption that each node has at least one outlink, i.e.
$d_i \neq 0$ for every $i \in \mathcal{N}$. Therefore the $n \times n$ stochastic matrix
$P = [P_{ij}]_{i,j \in \mathcal{N}}$ given by
\[ P_{ij} = \begin{cases} d_i^{-1} & \text{if } (i,j) \in \mathcal{E}, \\ 0 & \text{otherwise,} \end{cases} \]
is well defined and is a scaling of the adjacency matrix of $\mathcal{G}$. Let also
$0 < c < 1$ be a damping factor and $z$ be a positive stochastic personalization
vector, i.e. $z_i > 0$ for all $i = 1, \dots, n$ and $z^T \mathbf{1} = 1$, where $\mathbf{1}$ denotes the
vector of all ones. The Google matrix is then defined as
\[ G = cP + (1-c)\,\mathbf{1} z^T. \]

Since $z > 0$ and $c < 1$, this stochastic matrix is positive, i.e. $G_{ij} > 0$ for all
$i, j$. The PageRank vector $\pi$ is then defined as the unique invariant measure
of the matrix $G$, that is the unique left Perron vector of $G$,
\[ \pi^T = \pi^T G, \qquad \pi^T \mathbf{1} = 1. \tag{1} \]

The PageRank of a node $i$ is the $i$th entry $\pi_i = \pi^T e_i$ of the PageRank
vector.

The PageRank vector is usually interpreted as the stationary distribution 
of the following Markov chain (see for instance [13] ) : a random surfer moves 
on the webgraph, using hyperlinks between pages with a probability c and 
zapping to some new page according to the personalization vector with a 
probability (1 — c). The Google matrix G is the probability transition matrix 
of this random walk. In this stochastic interpretation, the PageRank of a 
node is equal to the inverse of its mean return time, that is $\pi_i^{-1}$ is the mean
number of steps a random surfer starting in node $i$ will take for coming back
to $i$ (see [7, ?]).
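
As a minimal sketch of these definitions, the snippet below builds the Google matrix of a small four-page graph and approximates the PageRank vector by power iteration, one standard way of computing a left Perron vector. The link set `LINKS` and the function names are illustrative assumptions, not objects from the paper.

```python
# Hypothetical toy webgraph on nodes 0..3; every node has at least one outlink.
LINKS = {(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)}
N = 4

def google_matrix(links, n, c=0.85, z=None):
    """Return G = c P + (1 - c) 1 z^T as a list of rows."""
    z = z if z is not None else [1.0 / n] * n     # uniform personalization
    deg = [sum(1 for (i, _) in links if i == k) for k in range(n)]
    return [[(c / deg[i] if (i, j) in links else 0.0) + (1.0 - c) * z[j]
             for j in range(n)] for i in range(n)]

def pagerank(links, n, c=0.85, z=None, iters=300):
    """Approximate the left Perron vector of G by power iteration."""
    g = google_matrix(links, n, c, z)
    pi = [1.0 / n] * n
    for _ in range(iters):
        # One step of the random surfer: pi^T <- pi^T G.
        pi = [sum(pi[i] * g[i][j] for i in range(n)) for j in range(n)]
    return pi

pi = pagerank(LINKS, N)
print([round(p, 4) for p in pi])
```

Because the second eigenvalue of $G$ has modulus at most $c$, the error of the iteration shrinks roughly like $c^k$, so a few hundred steps are ample for $c = 0.85$.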



3 PageRank of a website 

We are interested in characterizing the PageRank of a set $X$. We define this
as the sum
\[ \pi^T e_X = \sum_{i \in X} \pi_i, \]
where $e_X$ denotes the vector with a 1 in the entries of $X$ and 0 elsewhere.
Note that the PageRank of a set corresponds to the notion of energy of a
community in [5].

Let $X \subseteq \mathcal{N}$ be a subset of the nodes of the graph, and write
$\overline{X} = \mathcal{N} \setminus X$ for its complement. The PageRank of $X$ can
be expressed as $\pi^T e_X = (1-c)\, z^T (I - cP)^{-1} e_X$ from PageRank equations (1).
Let us then define the vector
\[ v = (I - cP)^{-1} e_X. \tag{2} \]

With this, we have the following expression for the PageRank of the set $X$:
\[ \pi^T e_X = (1-c)\, z^T v. \tag{3} \]

The vector v will play a crucial role throughout this paper. In this 
section, we will first present a probabilistic interpretation for this vector 
and prove some of its properties. We will then show how it can be used in 
order to analyze the influence of some page $i \in X$ on the PageRank of the
set X. We will end this section by briefly introducing the concept of basic 
absorbing graph, which will be useful in order to analyze optimal linkage 
strategies under some assumptions. 
3.1 Mean number of visits before zapping

Let us first see how the entries of the vector $v = (I - cP)^{-1} e_X$ can be
interpreted. Let us consider a random surfer on the webgraph $\mathcal{G}$ that,
as described in Section 2, follows the hyperlinks of the webgraph with a
probability $c$. But, instead of zapping to some page of $\mathcal{G}$ with a
probability $(1-c)$, he stops his walk with probability $(1-c)$ at each step of
time. This is equivalent to considering a random walk on the extended graph
$\mathcal{G}_e = (\mathcal{N} \cup \{n+1\},\ \mathcal{E} \cup \{(i, n+1) : i \in \mathcal{N}\})$ with a transition probability
matrix
\[ P_e = \begin{pmatrix} cP & (1-c)\,\mathbf{1} \\ 0 & 1 \end{pmatrix}. \]

At each step of time, with probability $1-c$, the random surfer can disappear
from the original graph, that is he can reach the absorbing node $n+1$.

The nonnegative matrix $(I - cP)^{-1}$ is commonly called the fundamental
matrix of the absorbing Markov chain defined by $P_e$ (see for instance
[10, 16]). In the extended graph $\mathcal{G}_e$, the entry $[(I - cP)^{-1}]_{ij}$ is the expected
number of visits to node $j$ before reaching the absorbing node $n+1$ when
starting from node $i$. From the point of view of the standard random surfer
described in Section 2, the entry $[(I - cP)^{-1}]_{ij}$ is the expected number of
visits to node $j$ before zapping for the first time when starting from node $i$.

Therefore, the vector $v$ defined in equation (2) has the following
probabilistic interpretation. The entry $v_i$ is the expected number of visits to the
set $X$ before zapping for the first time when the random surfer starts his
walk in node $i$.
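
The vector $v$ and identity (3) are easy to check numerically. The sketch below is illustrative only: the toy graph, the set `X` and the helper names are assumptions. It obtains $v$ as the fixed point of $v \mapsto cPv + e_X$, which converges because $c < 1$, and compares the sum of the PageRanks of `X` with $(1-c) z^T v$.

```python
LINKS = {(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)}   # hypothetical toy graph
N, C = 4, 0.85
X = {0, 1}                                          # set of pages of interest
Z = [1.0 / N] * N                                   # uniform personalization

DEG = [sum(1 for (i, _) in LINKS if i == k) for k in range(N)]

def visits_vector(iters=400):
    """Iterate v <- c P v + e_X; the fixed point is (I - cP)^{-1} e_X."""
    v = [0.0] * N
    for _ in range(iters):
        v = [(1.0 if i in X else 0.0)
             + C * sum(v[j] for (a, j) in LINKS if a == i) / DEG[i]
             for i in range(N)]
    return v

def pagerank(iters=400):
    """Iterate pi^T <- (1 - c) z^T + c pi^T P, whose fixed point is
    pi^T = (1 - c) z^T (I - cP)^{-1}."""
    pi = Z[:]
    for _ in range(iters):
        pi = [(1.0 - C) * Z[j]
              + C * sum(pi[i] / DEG[i] for (i, b) in LINKS if b == j)
              for j in range(N)]
    return pi

v = visits_vector()
pi = pagerank()
lhs = sum(pi[i] for i in X)                             # pi^T e_X
rhs = (1.0 - C) * sum(Z[i] * v[i] for i in range(N))    # (1 - c) z^T v
print(round(lhs, 6), round(rhs, 6))
```

Reading `v[i]` as an expected number of visits also explains the bound $v_i \leq 1/(1-c)$: the surfer makes on average $1/(1-c)$ steps before zapping, and not all of them need land in `X`.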

Now, let us first prove some simple properties about this vector. 
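Lemma 1. Let $v \in \mathbb{R}^n_{\geq 0}$ be defined by $v = cPv + e_X$. Then,

(a) $\max_{i \notin X} v_i \leq c \max_{i \in X} v_i$,

(b) $v_i \leq 1 + c\,v_i$ for all $i \in \mathcal{N}$; with equality if and only if the node $i$ does
not have an access to $\overline{X}$,

(c) $v_i \geq \min_{j \leftarrow i} v_j$ for all $i \in X$; with equality if and only if the node $i$
does not have an access to $\overline{X}$.

Proof. (a) Since $c < 1$, for all $i \notin X$,
\[ \max_{i \notin X} v_i = \max_{i \notin X} \Big( c \sum_{j \leftarrow i} \frac{v_j}{d_i} \Big) \leq c \max_{j} v_j. \]
Since $c < 1$, it then follows that $\max_j v_j = \max_{j \in X} v_j$.

(b) The inequality $v_i \leq \frac{1}{1-c}$ follows directly from
\[ \max_i v_i \leq \max_i \Big( 1 + c \sum_{j \leftarrow i} \frac{v_j}{d_i} \Big) \leq 1 + c \max_j v_j. \]
From (a) it then also follows that $v_i \leq \frac{c}{1-c}$ for all $i \notin X$. Now, let
$i \in \mathcal{N}$ such that $v_i = \frac{1}{1-c}$. Then $i \in X$. Moreover,
\[ 1 + c\,v_i = v_i = 1 + c \sum_{j \leftarrow i} \frac{v_j}{d_i}, \]
that is $v_j = \frac{1}{1-c}$ for every $j \leftarrow i$. Hence node $j$ must also belong to $X$.
By induction, every node $k$ such that $i$ has an access to $k$ must belong
to $X$.

(c) Let $i \in X$. Then, by (b),
\[ 1 + c\,v_i \geq v_i = 1 + c \sum_{j \leftarrow i} \frac{v_j}{d_i} \geq 1 + c \min_{j \leftarrow i} v_j, \]
so $v_i \geq \min_{j \leftarrow i} v_j$ for all $i \in X$. If $v_i = \min_{j \leftarrow i} v_j$ then also $1 + c\,v_i = v_i$
and hence, by (b), the node $i$ does not have an access to $\overline{X}$. □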

Let us denote the set of nodes of $\overline{X}$ which on average give the most visits
to $X$ before zapping by
\[ \mathcal{V} = \operatorname*{argmax}_{j \in \overline{X}} v_j. \]

Then the following lemma is quite intuitive. It says that, among the nodes
of $\overline{X}$, those which provide the highest mean number of visits to $X$ are parents
of $X$, i.e. parents of some node of $X$.

Lemma 2 (Parents of $X$). If $\mathcal{E}_{\mathrm{in}(X)} \neq \emptyset$, then
\[ \mathcal{V} \subseteq \{ j \in \overline{X} : \text{there exists } \ell \in X \text{ such that } (j, \ell) \in \mathcal{E}_{\mathrm{in}(X)} \}. \]
If $\mathcal{E}_{\mathrm{in}(X)} = \emptyset$, then $v_j = 0$ for every $j \in \overline{X}$.

Proof. Suppose first that $\mathcal{E}_{\mathrm{in}(X)} \neq \emptyset$. Let $k \in \mathcal{V}$ with $v = (I - cP)^{-1} e_X$. If
we supposed that there does not exist $\ell \in X$ such that $(k, \ell) \in \mathcal{E}_{\mathrm{in}(X)}$, then
we would have, since $v_k > 0$,
\[ v_k = c \sum_{j \leftarrow k} \frac{v_j}{d_k} \leq c \max_{j \notin X} v_j = c\,v_k < v_k, \]
which is a contradiction. Now, if $\mathcal{E}_{\mathrm{in}(X)} = \emptyset$, then there is no access to $X$
from $\overline{X}$, so clearly $v_j = 0$ for every $j \in \overline{X}$. □

Lemma 2 shows that the nodes $j \in \overline{X}$ which provide the highest value
of $v_j$ must belong to the set of parents of $X$. The converse is not true, as
we will see in the following example: some parents of $X$ can provide a lower
mean number of visits to $X$ than other nodes which are not parents of $X$. In
other words, Lemma 2 gives a necessary but not sufficient condition in order
to maximize the entry $v_j$ for some $j \in \overline{X}$.
Figure 2: The node $6 \notin \mathcal{V}$ and yet it is a parent of $X = \{1\}$ (see
Example 1).

Example 1. Let us see on an example that having $(j, i) \in \mathcal{E}_{\mathrm{in}(X)}$ for some
$i \in X$ is not sufficient to have $j \in \mathcal{V}$. Consider the graph in Figure 2. Let
$X = \{1\}$ and take a damping factor $c = 0.85$. For $v = (I - cP)^{-1} e_1$, we have
\[ v_2 = v_3 = v_4 = 4.359 > v_5 = 3.521 > v_6 = 3.492 > v_7 > \cdots > v_{11}, \]
so $\mathcal{V} = \{2, 3, 4\}$. As ensured by Lemma 2, every node of the set $\mathcal{V}$ is a parent
of node 1. But here, $\mathcal{V}$ does not contain all parents of node 1. Indeed, the
node $6 \notin \mathcal{V}$ while it is a parent of 1 and is moreover its parent with the
lowest outdegree. Moreover, we see in this example that node 5, which is
not a parent of node 1 but a parent of node 6, gives a higher value of the
expected number of visits to $X$ before zapping than node 6, parent of 1.
Let us try to get some intuition about that. When starting from node 6,
a random surfer has probability one half to reach node 1 in only one step.
But he has also a probability one half to move to node 11 and to be sent
far away from node 1. On the other side, when starting from node 5, the
random surfer cannot reach node 1 in only one step. But with probability
3/4 he will reach one of the nodes 2, 3 or 4 in one step. And from these
nodes, the websurfer stays very near to node 1 and cannot be sent far away
from it.

In the next lemma, we show that from some node $i \in X$ which has an
access to $\overline{X}$, there always exists what we call a decreasing path to $\overline{X}$. That is,
we can find a path such that the mean number of visits to $X$ is higher when
starting from some node of the path than when starting from the successor
of this node in the path.

Lemma 3 (Decreasing paths to $\overline{X}$). For every $i_0 \in X$ which has an access
to $\overline{X}$, there exists a path $(i_0, i_1, \dots, i_s)$ with $i_1, \dots, i_{s-1} \in X$ and $i_s \in \overline{X}$ such
that
\[ v_{i_0} > v_{i_1} > \cdots > v_{i_s}. \]

Proof. Let us simply construct a decreasing path recursively by
\[ i_{k+1} \in \operatorname*{argmin}_{j \leftarrow i_k} v_j, \]
as long as $i_k \in X$. If $i_k$ has an access to $\overline{X}$, then $v_{i_{k+1}} < v_{i_k} < \frac{1}{1-c}$ by
Lemma 1(b) and (c), so the node $i_{k+1}$ has also an access to $\overline{X}$. By assumption,
$i_0$ has an access to $\overline{X}$. Moreover, the set $X$ has a finite number of elements,
so there must exist an $s$ such that $i_s \in \overline{X}$. □

3.2 Influence of the outlinks of a node 

We will now see how a modification of the outlinks of some node $i \in \mathcal{N}$ can
change the PageRank of a subset of nodes $X \subseteq \mathcal{N}$. So we will compare two
graphs on $\mathcal{N}$ defined by their set of links, $\mathcal{E}$ and $\tilde{\mathcal{E}}$ respectively.

Every item corresponding to the graph defined by the set of links $\tilde{\mathcal{E}}$ will
be written with a tilde symbol. So $\tilde{P}$ denotes its scaled adjacency matrix,
$\tilde{\pi}$ the corresponding PageRank vector, $\tilde{d}_i = |\{j : (i,j) \in \tilde{\mathcal{E}}\}|$ the outdegree
of some node $i$ in this graph, $\tilde{v} = (I - c\tilde{P})^{-1} e_X$ and
$\tilde{\mathcal{V}} = \operatorname{argmax}_{j \in \overline{X}} \tilde{v}_j$.
Finally, by $j \,\tilde{\leftarrow}\, i$ we mean $j \in \{k : (i,k) \in \tilde{\mathcal{E}}\}$.

So, let us consider two graphs defined respectively by their set of links $\mathcal{E}$
and $\tilde{\mathcal{E}}$. Suppose that they differ only in the links starting from some given
node $i$, that is $\{j : (k,j) \in \mathcal{E}\} = \{j : (k,j) \in \tilde{\mathcal{E}}\}$ for all $k \neq i$. Then their
scaled adjacency matrices $P$ and $\tilde{P}$ are linked by a rank one correction. Let
us then define the vector
\[ \delta = \sum_{j \,\tilde{\leftarrow}\, i} \frac{e_j}{\tilde{d}_i} - \sum_{j \leftarrow i} \frac{e_j}{d_i}, \]
which gives the correction to apply to the line $i$ of the matrix $P$ in order to
get $\tilde{P}$.

Now let us first express the difference between the PageRank of $X$ for two
configurations differing only in the links starting from some node $i$. Note
that in the following lemma the personalization vector $z$ does not appear
explicitly in the expression of $\tilde{\pi}$.

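Lemma 4. Let two graphs be defined respectively by $\mathcal{E}$ and $\tilde{\mathcal{E}}$ and let $i \in \mathcal{N}$
such that for all $k \neq i$, $\{j : (k,j) \in \mathcal{E}\} = \{j : (k,j) \in \tilde{\mathcal{E}}\}$. Then
\[ \tilde{\pi}^T e_X = \pi^T e_X + c\,\pi_i\, \frac{\delta^T v}{1 - c\,\delta^T (I - cP)^{-1} e_i}. \]

Proof. Clearly, the scaled adjacency matrices are linked by $\tilde{P} = P + e_i \delta^T$.
Since $c < 1$, the matrix $(I - cP)^{-1}$ exists and the PageRank vectors can be
expressed as
\begin{align*}
\pi^T &= (1-c)\, z^T (I - cP)^{-1}, \\
\tilde{\pi}^T &= (1-c)\, z^T \big(I - c(P + e_i \delta^T)\big)^{-1}.
\end{align*}
Applying the Sherman-Morrison formula to $\big((I - cP) - c\,e_i \delta^T\big)^{-1}$, we get
\[ \tilde{\pi}^T = (1-c)\, z^T (I - cP)^{-1} + (1-c)\, z^T (I - cP)^{-1} e_i\, \frac{c\,\delta^T (I - cP)^{-1}}{1 - c\,\delta^T (I - cP)^{-1} e_i}, \]
and the result follows immediately. □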
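
The rank-one update formula of Lemma 4 can be verified numerically. In the sketch below, the two five-node graphs `OLD` and `NEW`, which differ only in the outlinks of node 0, are hypothetical examples, and the helper names are assumptions. The prediction of the lemma is compared with a direct recomputation of the PageRank of `X` on the modified graph.

```python
C, N = 0.85, 5
X = {0, 1}
Z = [1.0 / N] * N                        # uniform personalization vector
# Hypothetical graphs that differ only in the outlinks of node I_NODE.
OLD = {(0, 1), (0, 3), (1, 0), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)}
I_NODE = 0
NEW = (OLD - {(0, 3)}) | {(0, 0)}        # node 0 swaps the link 0->3 for a self-link

def row(links, i):
    """Row i of the scaled adjacency matrix P."""
    kids = [j for (a, j) in links if a == i]
    return [1.0 / len(kids) if j in kids else 0.0 for j in range(N)]

def resolvent(links, rhs, iters=500):
    """Fixed point of x <- c P x + rhs, i.e. x = (I - cP)^{-1} rhs."""
    p = [row(links, i) for i in range(N)]
    x = [0.0] * N
    for _ in range(iters):
        x = [rhs[i] + C * sum(p[i][j] * x[j] for j in range(N)) for i in range(N)]
    return x

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def pagerank_of_X(links):
    """pi^T e_X = (1 - c) z^T v, as in equation (3)."""
    return (1.0 - C) * dot(Z, resolvent(links, [1.0 if k in X else 0.0 for k in range(N)]))

v = resolvent(OLD, [1.0 if k in X else 0.0 for k in range(N)])
u = resolvent(OLD, [1.0 if k == I_NODE else 0.0 for k in range(N)])
pi_i = (1.0 - C) * dot(Z, u)             # pi_i = (1 - c) z^T (I - cP)^{-1} e_i
delta = [a - b for a, b in zip(row(NEW, I_NODE), row(OLD, I_NODE))]

predicted = pagerank_of_X(OLD) + C * pi_i * dot(delta, v) / (1.0 - C * dot(delta, u))
actual = pagerank_of_X(NEW)
print(round(predicted, 9), round(actual, 9))
```

Only quantities of the original graph (`v`, `u`, `pi_i`) enter the prediction, which is what makes the formula convenient for comparing candidate modifications of a single node.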

Let us now give an equivalent condition in order to increase the PageR- 
ank of X by changing outlinks of some node i. The PageRank of X increases 
essentially when the new set of links favors nodes giving a higher mean 
number of visits to X before zapping. 

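Theorem 5 (PageRank and mean number of visits before zapping). Let
two graphs be defined respectively by $\mathcal{E}$ and $\tilde{\mathcal{E}}$ and let $i \in \mathcal{N}$ such that for all
$k \neq i$, $\{j : (k,j) \in \mathcal{E}\} = \{j : (k,j) \in \tilde{\mathcal{E}}\}$. Then
\[ \tilde{\pi}^T e_X > \pi^T e_X \quad \text{if and only if} \quad \delta^T v > 0 \]
and $\tilde{\pi}^T e_X = \pi^T e_X$ if and only if $\delta^T v = 0$.

Proof. Let us first show that $\delta^T (I - cP)^{-1} e_i \leq 1$ is always verified. Let
$u = (I - cP)^{-1} e_i$. Then $u - cPu = e_i$ and, by Lemma 1(a), $u_j \leq u_i$ for all
$j$. So
\[ \delta^T u = \sum_{j \,\tilde{\leftarrow}\, i} \frac{u_j}{\tilde{d}_i} - \sum_{j \leftarrow i} \frac{u_j}{d_i}
 \leq u_i - \sum_{j \leftarrow i} \frac{u_j}{d_i}
 \leq u_i - c \sum_{j \leftarrow i} \frac{u_j}{d_i} = 1. \]
Now, since $c < 1$ and $\pi > 0$, the conclusion follows by Lemma 4. □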



The following Proposition 6 shows how to add a new link $(i,j)$ starting
from a given node $i$ in order to increase the PageRank of the set $X$. The
PageRank of $X$ increases as soon as a node $i \in X$ adds a link to a node $j$
with a larger or equal expected number of visits to $X$ before zapping.

Proposition 6 (Adding a link). Let $i \in X$ and let $j \in \mathcal{N}$ be such that
$(i,j) \notin \mathcal{E}$ and $v_i \leq v_j$. Let $\tilde{\mathcal{E}} = \mathcal{E} \cup \{(i,j)\}$. Then
\[ \tilde{\pi}^T e_X \geq \pi^T e_X \]
with equality if and only if the node $i$ does not have an access to $\overline{X}$.

Proof. Let $i \in X$ and let $j \in \mathcal{N}$ be such that $(i,j) \notin \mathcal{E}$ and $v_i \leq v_j$. Then
\[ 1 + c \sum_{k \leftarrow i} \frac{v_k}{d_i} = v_i \leq 1 + c\,v_i \leq 1 + c\,v_j, \]
with equality if and only if $i$ does not have an access to $\overline{X}$ by Lemma 1(b).
Let $\tilde{\mathcal{E}} = \mathcal{E} \cup \{(i,j)\}$. Then
\[ \delta^T v = \frac{1}{d_i + 1} \Big( v_j - \sum_{k \leftarrow i} \frac{v_k}{d_i} \Big) \geq 0, \]
with equality if and only if $i$ does not have an access to $\overline{X}$. The conclusion
follows from Theorem 5. □

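Now let us see how to remove a link $(i,j)$ starting from a given node $i$ in
order to increase the PageRank of the set $X$. If a node $i \in \mathcal{N}$ removes a link
to its worst child from the point of view of the expected number of visits to
$X$ before zapping, then the PageRank of $X$ increases.

Proposition 7 (Removing a link). Let $i \in \mathcal{N}$ and let $j \in \operatorname{argmin}_{k \leftarrow i} v_k$.
Let $\tilde{\mathcal{E}} = \mathcal{E} \setminus \{(i,j)\}$. Then
\[ \tilde{\pi}^T e_X \geq \pi^T e_X \]
with equality if and only if $v_k = v_j$ for every $k$ such that $(i,k) \in \mathcal{E}$.

Proof. Let $i \in \mathcal{N}$ and let $j \in \operatorname{argmin}_{k \leftarrow i} v_k$. Let $\tilde{\mathcal{E}} = \mathcal{E} \setminus \{(i,j)\}$. Then
\[ \delta^T v = \sum_{k \leftarrow i} \frac{v_k - v_j}{d_i (d_i - 1)} \geq 0, \]
with equality if and only if $v_k = v_j$ for all $k \leftarrow i$. The conclusion follows by
Theorem 5. □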
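
Propositions 6 and 7 can be illustrated by scanning all admissible single-link changes in a small graph. This is a sketch under stated assumptions: the five-node graph `LINKS`, the set `X`, the uniform personalization vector and the function names are hypothetical. Every addition allowed by Proposition 6 and every removal of a worst child (Proposition 7) is applied in turn, and the resulting change in the PageRank of `X` is recorded.

```python
C, N = 0.85, 5
X = {0, 1}
# Hypothetical graph; both nodes of X have an access to the outside.
LINKS = {(0, 1), (0, 3), (1, 0), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)}

def visits(links, iters=400):
    """v = (I - cP)^{-1} e_X as the fixed point of v <- c P v + e_X."""
    deg = [sum(1 for (a, _) in links if a == k) for k in range(N)]
    v = [0.0] * N
    for _ in range(iters):
        v = [(1.0 if i in X else 0.0)
             + C * sum(v[j] for (a, j) in links if a == i) / deg[i]
             for i in range(N)]
    return v

def score(links):
    """PageRank of X with uniform z: (1 - c) z^T v."""
    return (1.0 - C) * sum(visits(links)) / N

v = visits(LINKS)
base = score(LINKS)

# Proposition 6: a node i of X adds a link to some j with v_i <= v_j.
add_gains = {(i, j): score(LINKS | {(i, j)}) - base
             for i in X for j in range(N)
             if (i, j) not in LINKS and v[i] <= v[j]}

# Proposition 7: a node i removes the link to one of its worst children.
remove_gains = {}
for i in range(N):
    kids = [j for (a, j) in LINKS if a == i]
    if len(kids) >= 2:                   # keep at least one outlink
        j = min(kids, key=lambda k: v[k])
        remove_gains[(i, j)] = score(LINKS - {(i, j)}) - base

print("additions:", {l: round(g, 5) for l, g in add_gains.items()})
print("removals: ", {l: round(g, 5) for l, g in remove_gains.items()})
```

Self-links always satisfy the hypothesis of Proposition 6, since $v_i \leq v_i$, so `add_gains` is never empty when some node of `X` lacks a self-link.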

In order to increase the PageRank of $X$ with a new link $(i,j)$, Proposition 6
only requires that $v_i \leq v_j$. On the other side, Proposition 7 requires
that $v_j = \min_{k \leftarrow i} v_k$ in order to increase the PageRank of $X$ by deleting link
$(i,j)$. One could wonder whether or not this condition could be weakened
to $v_j < v_i$, so as to have symmetric conditions for the addition or deletion
of links. In fact, this cannot be done as shown in the following example.

Example 2. Let us see by an example that the condition $j \in \operatorname{argmin}_{k \leftarrow i} v_k$
in Proposition 7 cannot be weakened to $v_j < v_i$. Consider the graph in
Figure 3 and take a damping factor $c = 0.85$. Let $X = \{1, 2, 3\}$. We have
\[ v_1 = 2.63 > v_2 = 2.303 > v_3 = 1.533. \]

As ensured by Proposition 7, if we remove the link $(1,3)$, the PageRank of
$X$ increases (e.g. from 0.199 to 0.22 with a uniform personalization vector
$z = \frac{1}{n}\mathbf{1}$), since $3 \in \operatorname{argmin}_{k \leftarrow 1} v_k$. But, if we remove instead the link $(1,2)$,
the PageRank of $X$ decreases (from 0.199 to 0.179 with $z$ uniform) even if
$v_2 < v_1$.

Figure 3: For $X = \{1, 2, 3\}$, removing link $(1,2)$ gives $\tilde{\pi}^T e_X < \pi^T e_X$,
even if $v_1 > v_2$ (see Example 2).



Remark. Let us note that, if the node $i$ does not have an access to the set $\overline{X}$,
then for every deletion of a link starting from $i$, the PageRank of $X$ will not
be modified. Indeed, in this case $\delta^T v = 0$ since by Lemma 1(b), $v_j = \frac{1}{1-c}$
for every $j \leftarrow i$.

3.3 Basic absorbing graph 

Now, let us introduce briefly the notion of basic absorbing graph (see
Chapter III about absorbing Markov chains in Kemeny and Snell's book [10]).

For a given graph $(\mathcal{N}, \mathcal{E})$ and a specified subset of nodes $X \subseteq \mathcal{N}$, the basic
absorbing graph is the graph $(\mathcal{N}, \mathcal{E}^0)$ defined by $\mathcal{E}^0_{\mathrm{out}(X)} = \emptyset$,
$\mathcal{E}^0_X = \{(i,i) : i \in X\}$, $\mathcal{E}^0_{\mathrm{in}(X)} = \mathcal{E}_{\mathrm{in}(X)}$ and $\mathcal{E}^0_{\overline{X}} = \mathcal{E}_{\overline{X}}$. In other words, the basic
absorbing graph $(\mathcal{N}, \mathcal{E}^0)$ is a graph constructed from $(\mathcal{N}, \mathcal{E})$, keeping the
same sets of external inlinks and external links $\mathcal{E}_{\mathrm{in}(X)}$, $\mathcal{E}_{\overline{X}}$, removing the
external outlinks $\mathcal{E}_{\mathrm{out}(X)}$ and changing the internal link structure $\mathcal{E}_X$ in order
to have only self-links for nodes of $X$.

Like in the previous subsection, every item corresponding to the basic
absorbing graph will have a zero symbol. For instance, we will write $\pi_0$
for the PageRank vector corresponding to the basic absorbing graph and
$\mathcal{V}_0 = \operatorname{argmax}_{j \in \overline{X}} [(I - cP_0)^{-1} e_X]_j$.

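Proposition 8 (PageRank for a basic absorbing graph). Let a graph be defined
by a set of links $\mathcal{E}$ and let $X \subseteq \mathcal{N}$. Then
\[ \pi^T e_X \leq \pi_0^T e_X, \]
with equality if and only if $\mathcal{E}_{\mathrm{out}(X)} = \emptyset$.

Proof. Up to a permutation of the indices, equation (2) can be written as
\[ \begin{pmatrix} I - cP_X & -cP_{\mathrm{out}(X)} \\ -cP_{\mathrm{in}(X)} & I - cP_{\overline{X}} \end{pmatrix}
   \begin{pmatrix} v_X \\ v_{\overline{X}} \end{pmatrix}
   = \begin{pmatrix} \mathbf{1} \\ 0 \end{pmatrix}, \]
so we get
\[ v = \begin{pmatrix} v_X \\ c\,(I - cP_{\overline{X}})^{-1} P_{\mathrm{in}(X)}\, v_X \end{pmatrix}. \tag{4} \]
By Lemma 1(b) and since $(I - cP_{\overline{X}})^{-1}$ is a nonnegative matrix (see for
instance the chapter on M-matrices in Berman and Plemmons's book [1]),
we then have
\[ v \leq \begin{pmatrix} \frac{1}{1-c}\,\mathbf{1} \\ \frac{c}{1-c}\,(I - cP_{\overline{X}})^{-1} P_{\mathrm{in}(X)}\, \mathbf{1} \end{pmatrix} = v_0, \]
with equality if and only if no node of $X$ has an access to $\overline{X}$, that is
$\mathcal{E}_{\mathrm{out}(X)} = \emptyset$. The conclusion now follows from equation (3) and $z > 0$. □

Let us finally prove a nice property of the set $\mathcal{V}$ when $X = \{i\}$ is a
singleton: it is independent of the outlinks of $i$. In particular, it can be
found from the basic absorbing graph.

Lemma 9. Let a graph be defined by a set of links $\mathcal{E}$ and let $X = \{i\}$. Then there
exists a scalar $\alpha > 0$ such that $(I - cP)^{-1} e_i = \alpha\,(I - cP_0)^{-1} e_i$. As a consequence,
\[ \mathcal{V} = \mathcal{V}_0. \]

Proof. Let $X = \{i\}$. Since $v_X = v_i$ is a scalar, it follows from equation (4)
that the direction of the vector $v$ does not depend on $\mathcal{E}_X$ and $\mathcal{E}_{\mathrm{out}(X)}$ but
only on $\mathcal{E}_{\mathrm{in}(X)}$ and $\mathcal{E}_{\overline{X}}$. □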
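
The basic absorbing graph is simple to construct programmatically, which gives a quick check of Proposition 8 and Lemma 9. The sketch below uses a hypothetical five-node graph and assumed helper names. It replaces the links leaving the chosen set by self-links, compares the PageRank of the set before and after, and, for a singleton, checks that the two vectors $v$ are proportional.

```python
C, N = 0.85, 5
# Hypothetical graph used only for illustration.
LINKS = {(0, 1), (0, 3), (1, 0), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)}

def basic_absorbing(links, xs):
    """Keep links starting outside xs; give every node of xs a self-link only."""
    kept = {(i, j) for (i, j) in links if i not in xs}
    return kept | {(i, i) for i in xs}

def visits(links, xs, iters=400):
    """v = (I - cP)^{-1} e_X as the fixed point of v <- c P v + e_X."""
    deg = [sum(1 for (a, _) in links if a == k) for k in range(N)]
    v = [0.0] * N
    for _ in range(iters):
        v = [(1.0 if i in xs else 0.0)
             + C * sum(v[j] for (a, j) in links if a == i) / deg[i]
             for i in range(N)]
    return v

def score(links, xs):
    """PageRank of xs with uniform personalization: (1 - c) z^T v."""
    return (1.0 - C) * sum(visits(links, xs)) / N

X = {0, 1}
original = score(LINKS, X)
absorbing = score(basic_absorbing(LINKS, X), X)
print(round(original, 4), "<=", round(absorbing, 4))      # Proposition 8

# Lemma 9: for a singleton, v and v_0 point in the same direction.
single = {0}
v = visits(LINKS, single)
v0 = visits(basic_absorbing(LINKS, single), single)
ratios = [a / b for a, b in zip(v, v0)]
print([round(r, 6) for r in ratios])
```

In this example every node can reach node 0, so all entries of `v0` are positive and the entrywise ratios are well defined.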




4 Optimal linkage strategy for a website 

In this section, we consider a set of nodes $X$. For this set, we want to choose
the sets of internal links $\mathcal{E}_X \subseteq X \times X$ and external outlinks $\mathcal{E}_{\mathrm{out}(X)} \subseteq X \times \overline{X}$
in order to maximize the PageRank score of $X$, that is $\pi^T e_X$.

Let us first discuss the constraints on $\mathcal{E}$ we will consider. If we do
not impose any condition on $\mathcal{E}$, the problem of maximizing $\pi^T e_X$ is quite
trivial. As shown by Proposition 8, you should take in this case $\mathcal{E}_{\mathrm{out}(X)} = \emptyset$
and $\mathcal{E}_X$ an arbitrary subset of $X \times X$ such that each node has at least
one outlink. You just try to lure the random walker to your pages, not
allowing him to leave $X$ except by zapping according to the preference vector.
Therefore, it seems sensible to impose that $\mathcal{E}_{\mathrm{out}(X)}$ must be nonempty.

Now, let us show that, in order to avoid trivial solutions to our
maximization problem, it is not enough to assume that $\mathcal{E}_{\mathrm{out}(X)}$ must be nonempty.
Indeed, with this single constraint, in order to lose as few visits as possible
from the random walker, you should take a unique leaking node $k \in X$ (i.e.
$\mathcal{E}_{\mathrm{out}(X)} = \{(k, \ell)\}$ for some $\ell \in \overline{X}$) and isolate it from the rest of the set $X$
(i.e. $\{i \in X : (i,k) \in \mathcal{E}_X\} = \emptyset$).

Moreover, it seems reasonable to imagine that Google penalizes (or at
least tries to penalize) such behavior in the context of spam alliances [8].
All this discussion leads us to make the following assumption.

Assumption A (Accessibility). Every node of $X$ has an access to at least
one node of $\overline{X}$.

Let us now explain the basic ideas we will use in order to determine an
optimal linkage strategy for a set of webpages $X$. We determine some
forbidden patterns for an optimal linkage strategy and deduce the only possible
structure an optimal strategy can have. In other words, we assume that
we have a configuration which gives an optimal PageRank $\pi^T e_X$. Then we
prove that if some particular pattern appeared in this optimal structure,
then we could construct another graph for which the PageRank $\tilde{\pi}^T e_X$ is
strictly higher than $\pi^T e_X$.

We will firstly determine the shape of an optimal external outlink
structure $\mathcal{E}_{\mathrm{out}(X)}$, when the internal link structure $\mathcal{E}_X$ is given, in Theorem 10.
Then, given the external outlink structure $\mathcal{E}_{\mathrm{out}(X)}$, we will determine the
possible optimal internal link structure $\mathcal{E}_X$ in Theorem 11. Finally, we will put
both results together in Theorem 12 in order to get the general shape of an
optimal linkage strategy for a set $X$ when $\mathcal{E}_{\mathrm{in}(X)}$ and $\mathcal{E}_{\overline{X}}$ are given.

Proofs of this section will be illustrated by several figures for which we
take the following drawing convention.

Convention. When nodes are drawn from left to right on the same
horizontal line, they are arranged by decreasing value of $v_j$. Links are represented
by continuous arrows and paths by dashed arrows.

The first result of this section concerns the optimal outlink structure
$\mathcal{E}_{\mathrm{out}(X)}$ for the set $X$, while its internal structure $\mathcal{E}_X$ is given. An example of
optimal outlink structure is given after the theorem.

Theorem 10 (Optimal outlink structure). Let $\mathcal{E}_X$, $\mathcal{E}_{\mathrm{in}(X)}$ and $\mathcal{E}_{\overline{X}}$ be given.
Let $\mathcal{F}_1, \dots, \mathcal{F}_r$ be the final classes of the subgraph $(X, \mathcal{E}_X)$. Let $\mathcal{E}_{\mathrm{out}(X)}$ be such
that the PageRank $\pi^T e_X$ is maximal under Assumption A. Then $\mathcal{E}_{\mathrm{out}(X)}$ has
the following structure:
\[ \mathcal{E}_{\mathrm{out}(X)} = \mathcal{E}_{\mathrm{out}(\mathcal{F}_1)} \cup \cdots \cup \mathcal{E}_{\mathrm{out}(\mathcal{F}_r)}, \]
where for every $s = 1, \dots, r$,
\[ \emptyset \neq \mathcal{E}_{\mathrm{out}(\mathcal{F}_s)} \subseteq \{ (i,j) : i \in \operatorname*{argmin}_{k \in \mathcal{F}_s} v_k \text{ and } j \in \mathcal{V} \}. \]
Moreover for every $s = 1, \dots, r$, if $\mathcal{E}_{\mathcal{F}_s} \neq \emptyset$, then $|\mathcal{E}_{\mathrm{out}(\mathcal{F}_s)}| = 1$.
Proof. Let $\mathcal{E}_X$, $\mathcal{E}_{\mathrm{in}(X)}$ and $\mathcal{E}_{\overline{X}}$ be given. Suppose $\mathcal{E}_{\mathrm{out}(X)}$ is such that $\pi^T e_X$ is
maximal under Assumption A.

We will determine the possible leaking nodes of $X$ by analyzing three
different cases.

Firstly, let us consider some node $i \in X$ such that $i$ does not have children
in $X$, i.e. $\{k \in X : (i,k) \in \mathcal{E}_X\} = \emptyset$. Then clearly we have $\{i\} = \mathcal{F}_s$ for some
$s = 1, \dots, r$, with $i \in \operatorname{argmin}_{k \in \mathcal{F}_s} v_k$ and $\mathcal{E}_{\mathcal{F}_s} = \emptyset$. From Assumption A, we
have $\mathcal{E}_{\mathrm{out}(\mathcal{F}_s)} \neq \emptyset$, and from Theorem 5 and the optimality assumption, we
have $\mathcal{E}_{\mathrm{out}(\mathcal{F}_s)} \subseteq \{(i,j) : j \in \mathcal{V}\}$ (see Figure 4).

Figure 4: If $v_j < v_\ell$, then $\tilde{\pi}^T e_X > \pi^T e_X$ with
$\tilde{\mathcal{E}}_{\mathrm{out}(X)} = \mathcal{E}_{\mathrm{out}(X)} \cup \{(i,\ell)\} \setminus \{(i,j)\}$.

Secondly, let us consider some $i \in X$ such that $i$ has children in $X$, i.e.
$\{k \in X : (i,k) \in \mathcal{E}_X\} \neq \emptyset$ and
\[ v_i \leq \min_{\substack{k \leftarrow i \\ k \in X}} v_k. \]

Let $j \in \operatorname{argmin}_{k \leftarrow i} v_k$. Then $j \in \overline{X}$ and $v_j < v_i$ by Lemma 1(c). Suppose
by contradiction that the node $i$ would keep an access to $\overline{X}$ if we took
$\tilde{\mathcal{E}}_{\mathrm{out}(X)} = \mathcal{E}_{\mathrm{out}(X)} \setminus \{(i,j)\}$ instead of $\mathcal{E}_{\mathrm{out}(X)}$. Then, by Proposition 7,
considering $\tilde{\mathcal{E}}_{\mathrm{out}(X)}$ instead of $\mathcal{E}_{\mathrm{out}(X)}$ would increase strictly the PageRank of $X$
while Assumption A remains satisfied (see Figure 5). This would contradict
the optimality assumption for $\mathcal{E}_{\mathrm{out}(X)}$.

Figure 5: If $v_j = \min_{k \leftarrow i} v_k$ and $i$ has another access to $\overline{X}$, then
$\tilde{\pi}^T e_X > \pi^T e_X$ with $\tilde{\mathcal{E}}_{\mathrm{out}(X)} = \mathcal{E}_{\mathrm{out}(X)} \setminus \{(i,j)\}$.

From this, we conclude that

• the node $i$ belongs to a final class $\mathcal{F}_s$ of the subgraph $(X, \mathcal{E}_X)$ with
$\mathcal{E}_{\mathcal{F}_s} \neq \emptyset$ for some $s = 1, \dots, r$;

• there does not exist another $\ell \in \overline{X}$, $\ell \neq j$, such that $(i,\ell) \in \mathcal{E}_{\mathrm{out}(X)}$;

• there does not exist another $k$ in the same final class $\mathcal{F}_s$, $k \neq i$, such
that $(k,\ell) \in \mathcal{E}_{\mathrm{out}(X)}$ for some $\ell \in \overline{X}$.

Again, by Theorem 5 and the optimality assumption, we have $j \in \mathcal{V}$ (see
Figure 4).

Let us now notice that
\[ \max_{k \in \overline{X}} v_k < \min_{k \in X} v_k. \tag{5} \]

Indeed, with $i \in \operatorname{argmin}_{k \in X} v_k$, we are in one of the two cases analyzed above
for which we have seen that $v_i > v_j = \max_{k \in \overline{X}} v_k$.

Finally, consider a node $i \in X$ that does not belong to any of the final
classes of the subgraph $(X, \mathcal{E}_X)$. Suppose by contradiction that there exists
$j \in \overline{X}$ such that $(i,j) \in \mathcal{E}_{\mathrm{out}(X)}$. Let $\ell \in \operatorname{argmin}_{k \leftarrow i} v_k$. Then it follows
from inequality (5) that $\ell \in \overline{X}$. But the same argument as above shows
that the link $(i,\ell) \in \mathcal{E}_{\mathrm{out}(X)}$ must be removed since $\mathcal{E}_{\mathrm{out}(X)}$ is supposed to
be optimal (see Figure 5 again). So, there does not exist $j \in \overline{X}$ such that
$(i,j) \in \mathcal{E}_{\mathrm{out}(X)}$ for a node $i \in X$ which does not belong to any of the final
classes $\mathcal{F}_1, \dots, \mathcal{F}_r$. □

Example 3. Let us consider the graph given in Figure 6. The internal link
structure $\mathcal{E}_X$, as well as $\mathcal{E}_{\mathrm{in}(X)}$ and $\mathcal{E}_{\overline{X}}$ are given. The subgraph $(X, \mathcal{E}_X)$ has two
final classes $\mathcal{F}_1$ and $\mathcal{F}_2$. With $c = 0.85$ and $z$ the uniform probability vector,
this configuration has six optimal outlink structures (one of these solutions
is represented by bold arrows in Figure 6). Each one can be written as
$\mathcal{E}_{\mathrm{out}(X)} = \mathcal{E}_{\mathrm{out}(\mathcal{F}_1)} \cup \mathcal{E}_{\mathrm{out}(\mathcal{F}_2)}$, with $\mathcal{E}_{\mathrm{out}(\mathcal{F}_1)} = \{(4,6)\}$ or $\mathcal{E}_{\mathrm{out}(\mathcal{F}_1)} = \{(4,7)\}$
and $\emptyset \neq \mathcal{E}_{\mathrm{out}(\mathcal{F}_2)} \subseteq \{(5,6), (5,7)\}$. Indeed, since $\mathcal{E}_{\mathcal{F}_1} \neq \emptyset$, as stated by
Theorem 10, the final class $\mathcal{F}_1$ has exactly one external outlink in every
optimal outlink structure. On the other hand, the final class $\mathcal{F}_2$ may have
several external outlinks, since it is composed of a unique node and moreover
this node does not have a self-link. Note that $\mathcal{V} = \{6, 7\}$ in each of these six
optimal configurations, but this set $\mathcal{V}$ cannot be determined a priori since
it depends on the chosen outlink structure.

Now, let us determine the optimal internal link structure $\mathcal{E}_{\mathcal{I}}$
for the set $\mathcal{I}$, while its outlink structure
$\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ is given. Examples of optimal internal
structure are given after the proof of the theorem.

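Theorem 11 (Optimal internal link structure). Let
$\mathcal{E}_{\mathrm{out}(\mathcal{I})}$, $\mathcal{E}_{\mathrm{in}(\mathcal{I})}$ and
$\mathcal{E}_{\bar{\mathcal{I}}}$ be given. Let
$\mathcal{L} = \{i \in \mathcal{I} : (i,j) \in \mathcal{E}_{\mathrm{out}(\mathcal{I})} \text{ for some } j \in \bar{\mathcal{I}}\}$
be the set of leaking nodes of $\mathcal{I}$ and let $n_{\mathcal{L}} = |\mathcal{L}|$ be
the number of leaking nodes. Let $\mathcal{E}_{\mathcal{I}}$ be such that the PageRank
$\pi^T e_{\mathcal{I}}$ is maximal under Assumption A. Then there exists a permutation
of the indices such that $\mathcal{I} = \{1, 2, \dots, n_{\mathcal{I}}\}$,
$\mathcal{L} = \{n_{\mathcal{I}} - n_{\mathcal{L}} + 1, \dots, n_{\mathcal{I}}\}$,

$$v_1 > \cdots > v_{n_{\mathcal{I}} - n_{\mathcal{L}}} > v_{n_{\mathcal{I}} - n_{\mathcal{L}} + 1} \geq \cdots \geq v_{n_{\mathcal{I}}},$$

and $\mathcal{E}_{\mathcal{I}}$ has the following structure:

$$\mathcal{E}_{\mathcal{I}}^{L} \subseteq \mathcal{E}_{\mathcal{I}} \subseteq \mathcal{E}_{\mathcal{I}}^{U},$$

where

$$\mathcal{E}_{\mathcal{I}}^{L} = \{(i,j) \in \mathcal{I} \times \mathcal{I} : j \leq i\} \cup \{(i,j) \in (\mathcal{I} \setminus \mathcal{L}) \times \mathcal{I} : j = i+1\},$$
$$\mathcal{E}_{\mathcal{I}}^{U} = \mathcal{E}_{\mathcal{I}}^{L} \cup \{(i,j) \in \mathcal{L} \times \mathcal{L} : i < j\}.$$

Figure 6: Bold arrows represent one of the six optimal outlink structures
for this configuration with two final classes (see Example 3).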

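Proof. Let $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$, $\mathcal{E}_{\mathrm{in}(\mathcal{I})}$
and $\mathcal{E}_{\bar{\mathcal{I}}}$ be given. Suppose $\mathcal{E}_{\mathcal{I}}$ is such
that $\pi^T e_{\mathcal{I}}$ is maximal under Assumption A.

Firstly, by Proposition 6 and since every node of $\mathcal{I}$ has an access to
$\bar{\mathcal{I}}$, every node $i \in \mathcal{I}$ links to every node
$j \in \mathcal{I}$ such that $v_j \geq v_i$ (see Figure 7), that is

$$\{(i,j) \in \mathcal{E}_{\mathcal{I}} : v_i \leq v_j\} = \{(i,j) \in \mathcal{I} \times \mathcal{I} : v_i \leq v_j\}. \qquad (6)$$

Figure 7: Every $i \in \mathcal{I}$ must link to every $j \in \mathcal{I}$ with $v_j \geq v_i$.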

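Secondly, let $(k,i) \in \mathcal{E}_{\mathcal{I}}$ such that $k \neq i$ and
$k \in \mathcal{I} \setminus \mathcal{L}$. Let us prove that, if the node $i$ has an
access to $\bar{\mathcal{I}}$ by a path $(i, i_1, \dots, i_s)$ such that $i_j \neq k$
for all $j = 1, \dots, s$ and $i_s \in \bar{\mathcal{I}}$, then $v_i < v_k$ (see
Figure 8). Indeed, if we had $v_k \leq v_i$ then, by Lemma 1(c), there would exist
$\ell \in \mathcal{I}$ such that $(k,\ell) \in \mathcal{E}_{\mathcal{I}}$ and
$v_\ell = \min_{j \leftarrow k} v_j < v_k \leq v_i$. But, with
$\tilde{\mathcal{E}}_{\mathcal{I}} = \mathcal{E}_{\mathcal{I}} \setminus \{(k,\ell)\}$,
we would have $\tilde{\pi}^T e_{\mathcal{I}} > \pi^T e_{\mathcal{I}}$ by Proposition 7,
while Assumption A remains satisfied since the node $k$ would keep access to
$\bar{\mathcal{I}}$ via the node $i$ (see Figure 9). That contradicts the optimality
assumption. This leads us to the conclusion that $v_k > v_i$ for every
$k \in \mathcal{I} \setminus \mathcal{L}$ and $i \in \mathcal{L}$. Moreover
$v_i \neq v_k$ for every $i, k \in \mathcal{I} \setminus \mathcal{L}$, $i \neq k$.
Indeed, if we had $v_i = v_k$, then $(k,i) \in \mathcal{E}_{\mathcal{I}}$ by (6) while
by Lemma 3, the node $i$ would have an access to $\bar{\mathcal{I}}$ by a path
independent of $k$. So we should have $v_i < v_k$, a contradiction.

Figure 8: The node $i$ cannot have an access to $\bar{\mathcal{I}}$ without crossing $k$,
since in this case we should then have $v_i < v_k$.

Figure 9: If $v_\ell = \min_{j \leftarrow k} v_j$, then
$\tilde{\pi}^T e_{\mathcal{I}} > \pi^T e_{\mathcal{I}}$ with
$\tilde{\mathcal{E}}_{\mathcal{I}} = \mathcal{E}_{\mathcal{I}} \setminus \{(k,\ell)\}$.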

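We conclude from this that we can relabel the nodes of $\mathcal{N}$ such that
$\mathcal{I} = \{1, 2, \dots, n_{\mathcal{I}}\}$,
$\mathcal{L} = \{n_{\mathcal{I}} - n_{\mathcal{L}} + 1, \dots, n_{\mathcal{I}}\}$ and

$$v_1 > v_2 > \cdots > v_{n_{\mathcal{I}} - n_{\mathcal{L}}} > v_{n_{\mathcal{I}} - n_{\mathcal{L}} + 1} \geq \cdots \geq v_{n_{\mathcal{I}}}. \qquad (7)$$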

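It follows also that, for $i \in \mathcal{I} \setminus \mathcal{L}$ and $j > i$,
$(i,j) \in \mathcal{E}_{\mathcal{I}}$ if and only if $j = i + 1$. Indeed, suppose first
$i < n_{\mathcal{I}} - n_{\mathcal{L}}$. Then, we cannot have
$(i,j) \in \mathcal{E}_{\mathcal{I}}$ with $j > i + 1$ since in this case we would
contradict the ordering of the nodes given by equation (7) (see Figure [H] again
with $k = i + 1$ and remember that by Lemma 3, node $j$ has an access to
$\bar{\mathcal{I}}$ by a decreasing path). Moreover, node $i$ must link to some node
$j > i$ in order to satisfy Assumption A, so $(i, i+1)$ must belong to
$\mathcal{E}_{\mathcal{I}}$. Now, consider the case
$i = n_{\mathcal{I}} - n_{\mathcal{L}}$. Suppose we had
$(i,j) \in \mathcal{E}_{\mathcal{I}}$ with $j > i + 1$. Let us first note that there
cannot exist two or more different links $(i,\ell)$ with $\ell \in \mathcal{L}$, since
in this case we could remove one of these links and strictly increase the PageRank
of the set $\mathcal{I}$. If $v_j = v_{i+1}$, we could relabel the nodes by permuting
these two indices. If $v_j < v_{i+1}$, then with
$\tilde{\mathcal{E}}_{\mathcal{I}} = \mathcal{E}_{\mathcal{I}} \cup \{(i, i+1)\} \setminus \{(i,j)\}$,
we would have $\tilde{\pi}^T e_{\mathcal{I}} > \pi^T e_{\mathcal{I}}$ by Theorem 5, while
Assumption A remains satisfied since the node $i$ would keep access to
$\bar{\mathcal{I}}$ via node $i + 1$. That contradicts the optimality assumption. So
we have proved that

$$\{(i,j) \in \mathcal{E}_{\mathcal{I}} : i < j \text{ and } i \in \mathcal{I} \setminus \mathcal{L}\} = \{(i, i+1) : i \in \mathcal{I} \setminus \mathcal{L}\}. \qquad (8)$$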

Thirdly, it is obvious that

$$\{(i,j) \in \mathcal{E}_{\mathcal{I}} : i < j \text{ and } i \in \mathcal{L}\} \subseteq \{(i,j) \in \mathcal{L} \times \mathcal{L} : i < j\}. \qquad (9)$$

The announced structure for a set $\mathcal{E}_{\mathcal{I}}$ giving a maximal PageRank
score $\pi^T e_{\mathcal{I}}$ under Assumption A now follows directly from equations
(6), (8) and (9). □
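The two bounding sets of Theorem 11 can be generated mechanically. The following Python sketch is only an illustration and not part of the original analysis: the function name `internal_bounds` and its arguments are hypothetical, and the nodes are assumed to be already labelled as in the theorem, with the leaking nodes placed last.

```python
def internal_bounds(n_i, n_l):
    """Return the lower and upper link sets of Theorem 11.

    Assumes I = {1, ..., n_i} and that the leaking nodes are the last
    n_l nodes, i.e. {n_i - n_l + 1, ..., n_i}, with n_l >= 1.
    """
    nodes = range(1, n_i + 1)
    leaking = set(range(n_i - n_l + 1, n_i + 1))
    # Every backward link and every self-link: j <= i.
    lower = {(i, j) for i in nodes for j in nodes if j <= i}
    # Forward chain: each non-leaking node i links to i + 1.
    lower |= {(i, i + 1) for i in nodes if i not in leaking and i + 1 <= n_i}
    # Optional forward links between leaking nodes.
    upper = lower | {(i, j) for i in leaking for j in leaking if i < j}
    return lower, upper


lower, upper = internal_bounds(4, 2)
print(sorted(upper - lower))  # the links that are allowed but not required
```

With a single leaking node the two sets coincide, which is the situation of the optimal link structure when the outlinks can also be chosen.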



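Example 4. Let us consider the graphs given in Figure 10. For both cases,
the external outlink structure $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ with two leaking
nodes, as well as $\mathcal{E}_{\mathrm{in}(\mathcal{I})}$ and
$\mathcal{E}_{\bar{\mathcal{I}}}$, are given. With $c = 0.85$ and $z$ the uniform
probability vector, the optimal internal link structure for configuration (a) is
given by $\mathcal{E}_{\mathcal{I}} = \mathcal{E}_{\mathcal{I}}^{L}$, while in
configuration (b) we have $\mathcal{E}_{\mathcal{I}} = \mathcal{E}_{\mathcal{I}}^{U}$
(bold arrows), with $\mathcal{E}_{\mathcal{I}}^{L}$ and $\mathcal{E}_{\mathcal{I}}^{U}$
defined in Theorem 11.

Figure 10: Bold arrows represent optimal internal link structures. In (a)
we have $\mathcal{E}_{\mathcal{I}} = \mathcal{E}_{\mathcal{I}}^{L}$, while
$\mathcal{E}_{\mathcal{I}} = \mathcal{E}_{\mathcal{I}}^{U}$ in (b).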



Finally, combining the optimal outlink structure and the optimal internal
link structure described in Theorems 10 and 11, we find the optimal linkage
strategy for a set of webpages. Let us note that, since we have here control
on both $\mathcal{E}_{\mathcal{I}}$ and $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$, there
are no more cases of several final classes or several leaking nodes to consider.
For an example of optimal link structure, see Figure 1.

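Theorem 12 (Optimal link structure). Let $\mathcal{E}_{\mathrm{in}(\mathcal{I})}$ and
$\mathcal{E}_{\bar{\mathcal{I}}}$ be given. Let $\mathcal{E}_{\mathcal{I}}$ and
$\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ be such that $\pi^T e_{\mathcal{I}}$ is maximal
under Assumption A. Then there exists a permutation of the indices such that
$\mathcal{I} = \{1, 2, \dots, n_{\mathcal{I}}\}$,

$$v_1 > \cdots > v_{n_{\mathcal{I}}} > v_{n_{\mathcal{I}}+1} \geq \cdots \geq v_n,$$

and $\mathcal{E}_{\mathcal{I}}$ and $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ have the
following structure:

$$\mathcal{E}_{\mathcal{I}} = \{(i,j) \in \mathcal{I} \times \mathcal{I} : j \leq i \text{ or } j = i+1\},$$
$$\mathcal{E}_{\mathrm{out}(\mathcal{I})} = \{(n_{\mathcal{I}}, n_{\mathcal{I}}+1)\}.$$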

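Proof. Let $\mathcal{E}_{\mathrm{in}(\mathcal{I})}$ and $\mathcal{E}_{\bar{\mathcal{I}}}$
be given and suppose $\mathcal{E}_{\mathcal{I}}$ and
$\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ are such that $\pi^T e_{\mathcal{I}}$ is
maximal under Assumption A. Let us relabel the nodes of $\mathcal{N}$ such that
$\mathcal{I} = \{1, 2, \dots, n_{\mathcal{I}}\}$ and
$v_1 \geq \cdots \geq v_{n_{\mathcal{I}}} > v_{n_{\mathcal{I}}+1} = \max_{j \in \bar{\mathcal{I}}} v_j$.
By Theorem 11, $(i,j) \in \mathcal{E}_{\mathcal{I}}$ for all nodes
$i, j \in \mathcal{I}$ such that $j \leq i$. In particular, every node of
$\mathcal{I}$ has an access to node 1. Therefore, there is a unique final class
$\mathcal{F}_1 \subseteq \mathcal{I}$ in the subgraph
$(\mathcal{I}, \mathcal{E}_{\mathcal{I}})$. So, by Theorem 10,
$\mathcal{E}_{\mathrm{out}(\mathcal{I})} = \{(k,\ell)\}$ for some
$k \in \mathcal{F}_1$ and $\ell \in \bar{\mathcal{I}}$. Without loss of generality, we
can suppose that $\ell = n_{\mathcal{I}} + 1$. By Theorem 11 again, the leaking node
is $k = n_{\mathcal{I}}$ and therefore $(i, i+1) \in \mathcal{E}_{\mathcal{I}}$ for every
node $i \in \{1, \dots, n_{\mathcal{I}} - 1\}$. □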
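The statement of Theorem 12 can be checked by exhaustive search on a very small graph. The sketch below is an illustration under stated assumptions, not the authors' code: the four-node graph `FIXED`, the controlled set `I_SET` and all function names are hypothetical, the personalization vector is taken uniform, and the vector $v$ is approximated by the fixed-point iteration $v \leftarrow e_{\mathcal{I}} + cPv$ rather than by a linear solve. Nodes are labelled from 0.

```python
from itertools import combinations

C = 0.85                            # damping factor
N = 4                               # nodes 0..3, uniform personalization vector
I_SET = (0, 1)                      # pages controlled by the webmaster
FIXED = {(2, 0), (2, 3), (3, 2)}    # hypothetical inlinks to I and links outside I


def visits(links):
    """Approximate v = (I - cP)^{-1} e_I by iterating v <- e_I + c P v."""
    out = {i: [j for (a, j) in links if a == i] for i in range(N)}
    v = [0.0] * N
    for _ in range(400):
        v = [(1.0 if i in I_SET else 0.0)
             + C * sum(v[j] for j in out[i]) / len(out[i])
             for i in range(N)]
    return v


def satisfies_assumption_a(links):
    """Check that every node of I has an access to some node outside I."""
    for start in I_SET:
        seen, stack = {start}, [start]
        while stack:
            a = stack.pop()
            for (x, y) in links:
                if x == a and y not in seen:
                    seen.add(y)
                    stack.append(y)
        if all(node in I_SET for node in seen):
            return False
    return True


def best_structure():
    """Enumerate every set of links starting from I and keep the best one."""
    candidates = [(i, j) for i in I_SET for j in range(N)]
    best_value, best_links = -1.0, None
    for r in range(1, len(candidates) + 1):
        for chosen in combinations(candidates, r):
            links = FIXED | set(chosen)
            if not satisfies_assumption_a(links):
                continue
            value = (1 - C) / N * sum(visits(links))  # pi^T e_I = (1 - c) z^T v
            if value > best_value:
                best_value, best_links = value, set(chosen)
    return best_value, best_links


value, links = best_structure()
internal = {(i, j) for (i, j) in links if j in I_SET}
external = links - internal
print(sorted(internal), sorted(external))
```

For a set of two pages, the structure of Theorem 12 amounts to all four internal links (self-links included) together with a single external outlink, which is what the search is expected to return. Exhaustive enumeration is only practical for a handful of nodes, since the number of candidate link sets grows exponentially.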

Let us note that having a structure like the one described in Theorem 12 is a
necessary but not sufficient condition for a maximal PageRank.

Example 5. Let us show by an example that the graph structure given in
Theorem 12 is not sufficient to have a maximal PageRank. Consider for
instance the graphs in Figure 11. Let $c = 0.85$ and take a uniform
personalization vector $z = \frac{1}{n}\mathbf{1}$. Both graphs have the link
structure required by Theorem 12 in order to have a maximal PageRank, with
$v_{(a)} = (6.484 \;\; 6.42 \;\; 6.224 \;\; 5.457)^T$ and
$v_{(b)} = (6.432 \;\; 6.494 \;\; 6.247 \;\; 5.52)^T$. But the configuration (a) is
not optimal since in this case, the PageRank $\pi_{(a)}^T e_{\mathcal{I}} = 0.922$ is
strictly less than the PageRank $\pi_{(b)}^T e_{\mathcal{I}} = 0.926$ obtained by the
configuration (b). Let us nevertheless note that, with a non-uniform
personalization vector $z = (0.7 \;\; 0.1 \;\; 0.1 \;\; 0.1)^T$, the link structure (a)
would be optimal.



Figure 11: For $\mathcal{I} = \{1,2,3\}$, $c = 0.85$ and $z$ uniform, the link
structure in (a) is not optimal and yet it satisfies the necessary conditions of
Theorem 12 (see Example 5).

5 Extensions and variants

Let us now present some extensions and variants of the results of the previous
section. We will first emphasize the role of parents of $\mathcal{I}$. Secondly, we
will briefly talk about Avrachenkov-Litvak's optimal link structure for the case
where $\mathcal{I}$ is a singleton. Then we will give variants of Theorem 12 when
self-links are forbidden or when a minimal number of external outlinks is
required. Finally, we will make some comments on the influence of external
inlinks on the PageRank of $\mathcal{I}$.

5.1 Linking to parents 

If some node of $\mathcal{I}$ has at least one parent in $\bar{\mathcal{I}}$ then the
optimal linkage strategy for $\mathcal{I}$ is to have an internal link structure like
the one described in Theorem 12 together with a single link to one of the parents
of $\mathcal{I}$.

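Corollary 13 (Necessity of linking to parents). Let
$\mathcal{E}_{\mathrm{in}(\mathcal{I})} \neq \emptyset$ and $\mathcal{E}_{\bar{\mathcal{I}}}$
be given. Let $\mathcal{E}_{\mathcal{I}}$ and $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ be
such that $\pi^T e_{\mathcal{I}}$ is maximal under Assumption A. Then
$\mathcal{E}_{\mathrm{out}(\mathcal{I})} = \{(i,j)\}$ for some $i \in \mathcal{I}$ and
$j \in \bar{\mathcal{I}}$ such that $(j,k) \in \mathcal{E}_{\mathrm{in}(\mathcal{I})}$
for some $k \in \mathcal{I}$.

Proof. This is a direct consequence of Lemma 2 and Theorem 12. □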

Let us nevertheless remember that not every parent of nodes of $\mathcal{I}$ will
give an optimal link structure, as we have already discussed in Example Q] 
and we develop now. 

Example 6. Let us continue Example 1. We consider the graph in Figure 2 as
basic absorbing graph for $\mathcal{I} = \{1\}$, that is,
$\mathcal{E}_{\mathrm{in}(\mathcal{I})}$ and $\mathcal{E}_{\bar{\mathcal{I}}}$ are given.
We take $c = 0.85$ as damping factor and a uniform personalization vector
$z = \frac{1}{n}\mathbf{1}$. We have seen in Example H] that
$\mathcal{V}_0 = \{2, 3, 4\}$. Let us consider the value of the PageRank $\pi_1$ for
different sets $\mathcal{E}_{\mathcal{I}}$ and $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$:

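                                          $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$
                                   $\emptyset$   $\{(1,2)\}$   $\{(1,5)\}$   $\{(1,6)\}$   $\{(1,2),(1,3)\}$
$\mathcal{E}_{\mathcal{I}} = \emptyset$      /        0.1739      0.1402      0.1392      0.1739
$\mathcal{E}_{\mathcal{I}} = \{(1,1)\}$    0.5150     0.2600      0.2204      0.2192      0.2231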



As expected from Corollary 15, the optimal linkage strategy for
$\mathcal{I} = \{1\}$ is to have a self-link and a link to one of the nodes 2, 3
or 4. We note also that a link to node 6, which is a parent of node 1, provides a
lower PageRank than a link to node 5, which is not a parent of node 1. Finally, if
we suppose self-links are forbidden (see below), then the optimal linkage strategy
is to link to one or more of the nodes 2, 3, 4.

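In the case where no node of $\mathcal{I}$ has a parent in $\bar{\mathcal{I}}$, every
structure like the one described in Theorem 12 will give an optimal link structure.

Proposition 14 (No external parent). Let $\mathcal{E}_{\mathrm{in}(\mathcal{I})}$ and
$\mathcal{E}_{\bar{\mathcal{I}}}$ be given. Suppose that
$\mathcal{E}_{\mathrm{in}(\mathcal{I})} = \emptyset$. Then the PageRank
$\pi^T e_{\mathcal{I}}$ is maximal under Assumption A if and only if

$$\mathcal{E}_{\mathcal{I}} = \{(i,j) \in \mathcal{I} \times \mathcal{I} : j \leq i \text{ or } j = i+1\},$$
$$\mathcal{E}_{\mathrm{out}(\mathcal{I})} = \{(n_{\mathcal{I}}, n_{\mathcal{I}}+1)\},$$

for some permutation of the indices such that
$\mathcal{I} = \{1, 2, \dots, n_{\mathcal{I}}\}$.

Proof. This follows directly from $\pi^T e_{\mathcal{I}} = (1 - c)\, z^T v$ and the fact
that, if $\mathcal{E}_{\mathrm{in}(\mathcal{I})} = \emptyset$,

$$v = (I - cP)^{-1} e_{\mathcal{I}} = \begin{pmatrix} (I - cP_{\mathcal{I}})^{-1}\mathbf{1} \\ 0 \end{pmatrix},$$

up to a permutation of the indices. □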
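The two facts used in this proof can be checked numerically. The sketch below is an illustration only: the five-node graph `LINKS`, the helper names and the use of plain fixed-point iterations (instead of a linear solver) are assumptions made here. It computes the PageRank vector $\pi$ and the vector $v$ independently, then compares $\pi^T e_{\mathcal{I}}$ with $(1-c)\,z^T v$ and looks at the entries of $v$ outside $\mathcal{I}$.

```python
C = 0.85
N = 5
I_SET = {0, 1, 2}
# Hypothetical graph (0-based labels): I has the structure of Theorem 12,
# leaks through (2, 3), and receives no link from the outside nodes 3 and 4.
LINKS = {(0, 0), (0, 1),
         (1, 0), (1, 1), (1, 2),
         (2, 0), (2, 1), (2, 2), (2, 3),
         (3, 4), (4, 3)}
Z = [1.0 / N] * N                      # uniform personalization vector
OUT = {i: [j for (a, j) in LINKS if a == i] for i in range(N)}


def visits(iters=500):
    """Approximate v = (I - cP)^{-1} e_I by iterating v <- e_I + c P v."""
    v = [0.0] * N
    for _ in range(iters):
        v = [(1.0 if i in I_SET else 0.0)
             + C * sum(v[j] for j in OUT[i]) / len(OUT[i])
             for i in range(N)]
    return v


def pagerank(iters=500):
    """Approximate pi by iterating pi^T <- c pi^T P + (1 - c) z^T."""
    pi = list(Z)
    for _ in range(iters):
        new = [(1 - C) * Z[j] for j in range(N)]
        for i in range(N):
            share = C * pi[i] / len(OUT[i])
            for j in OUT[i]:
                new[j] += share
        pi = new
    return pi


v = visits()
pi = pagerank()
lhs = sum(pi[i] for i in I_SET)                      # pi^T e_I
rhs = (1 - C) * sum(Z[k] * v[k] for k in range(N))   # (1 - c) z^T v
print(abs(lhs - rhs) < 1e-9, v[3], v[4])
```

Since no external node links to $\mathcal{I}$ in this graph, the entries of $v$ for nodes 3 and 4 stay at zero, so $\pi^T e_{\mathcal{I}}$ depends neither on the external node chosen as target of the single outlink nor on the links between external nodes.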

5.2 Optimal linkage strategy for a singleton 

The optimal outlink structure for a single webpage has already been given
by Avrachenkov and Litvak in [2]. Their result becomes a particular case of
Theorem 12. Note that in the case of a single node, the possible choices for
$\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ can be found a priori by considering the basic
absorbing graph, since $\mathcal{V} = \mathcal{V}_0$.

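Corollary 15 (Optimal link structure for a single node). Let
$\mathcal{I} = \{i\}$ and let $\mathcal{E}_{\mathrm{in}(\mathcal{I})}$ and
$\mathcal{E}_{\bar{\mathcal{I}}}$ be given. Then the PageRank $\pi_i$ is maximal under
Assumption A if and only if $\mathcal{E}_{\mathcal{I}} = \{(i,i)\}$ and
$\mathcal{E}_{\mathrm{out}(\mathcal{I})} = \{(i,j)\}$ for some $j \in \mathcal{V}_0$.

Proof. This follows directly from Lemma E] and Theorem 12. □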
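The singleton case is small enough to be explored by hand or by a short script. The following Python sketch is an illustration under stated assumptions: the four-node graph `EXTERNAL`, in which node 0 plays the role of the single page, and all function names are hypothetical, and $z$ is taken uniform. It first computes $\mathcal{V}_0$ on the basic absorbing graph (node 0 with only a self-link), then compares every admissible choice of links starting from node 0.

```python
from itertools import combinations

C, N = 0.85, 4
EXTERNAL = {(1, 0), (1, 2), (2, 3), (3, 1)}   # hypothetical links of the other pages
OTHERS = (1, 2, 3)


def visits(links):
    """Approximate v = (I - cP)^{-1} e_I for I = {0} by fixed-point iteration."""
    out = {i: [j for (a, j) in links if a == i] for i in range(N)}
    v = [0.0] * N
    for _ in range(500):
        v = [(1.0 if i == 0 else 0.0)
             + C * sum(v[j] for j in out[i]) / len(out[i])
             for i in range(N)]
    return v


# Basic absorbing graph: node 0 only has a self-link.
v0 = visits(EXTERNAL | {(0, 0)})
top = max(v0[j] for j in OTHERS)
V0 = {j for j in OTHERS if abs(v0[j] - top) < 1e-12}

# Assumption A forces at least one external outlink from node 0.
best_value, best_links = -1.0, None
for with_self_link in (False, True):
    for r in range(1, len(OTHERS) + 1):
        for targets in combinations(OTHERS, r):
            own = {(0, j) for j in targets}
            if with_self_link:
                own.add((0, 0))
            value = (1 - C) / N * sum(visits(EXTERNAL | own))   # PageRank of node 0
            if value > best_value:
                best_value, best_links = value, own

print(sorted(V0), sorted(best_links))
```

On this graph the best choice found by the search is a self-link plus one outlink towards a node of $\mathcal{V}_0$, in agreement with Corollary 15. The point of computing $\mathcal{V}_0$ first is that, for a single page, the candidate targets are known before any outlink is chosen.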

5.3 Optimal linkage strategy under additional assumptions 

Let us consider the problem of maximizing the PageRank $\pi^T e_{\mathcal{I}}$ when
self-links are forbidden. Indeed, it seems to be often supposed that Google's
PageRank algorithm does not take self-links into account. In this case,
Theorem 12 can be adapted readily for the case where $|\mathcal{I}| \geq 2$. When
$\mathcal{I}$ is a singleton, we must have $\mathcal{E}_{\mathcal{I}} = \emptyset$, so
$\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ can contain several links, as stated in
Theorem 10.

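Corollary 16 (Optimal link structure with no self-links). Suppose
$|\mathcal{I}| \geq 2$. Let $\mathcal{E}_{\mathrm{in}(\mathcal{I})}$ and
$\mathcal{E}_{\bar{\mathcal{I}}}$ be given. Let $\mathcal{E}_{\mathcal{I}}$ and
$\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ be such that $\pi^T e_{\mathcal{I}}$ is maximal
under Assumption A and the assumption that there does not exist
$i \in \mathcal{I}$ such that $(i,i) \in \mathcal{E}_{\mathcal{I}}$. Then there exists a
permutation of the indices such that
$\mathcal{I} = \{1, 2, \dots, n_{\mathcal{I}}\}$,
$v_1 > \cdots > v_{n_{\mathcal{I}}} > v_{n_{\mathcal{I}}+1} \geq \cdots \geq v_n$, and
$\mathcal{E}_{\mathcal{I}}$ and $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ have the
following structure:

$$\mathcal{E}_{\mathcal{I}} = \{(i,j) \in \mathcal{I} \times \mathcal{I} : j < i \text{ or } j = i+1\},$$
$$\mathcal{E}_{\mathrm{out}(\mathcal{I})} = \{(n_{\mathcal{I}}, n_{\mathcal{I}}+1)\}.$$

Corollary 17 (Optimal link structure for a single node with no self-link).
Suppose $\mathcal{I} = \{i\}$. Let $\mathcal{E}_{\mathrm{in}(\mathcal{I})}$ and
$\mathcal{E}_{\bar{\mathcal{I}}}$ be given. Suppose
$\mathcal{E}_{\mathcal{I}} = \emptyset$. Then the PageRank $\pi_i$ is maximal under
Assumption A if and only if
$\emptyset \neq \mathcal{E}_{\mathrm{out}(\mathcal{I})} \subseteq \{(i,j) : j \in \mathcal{V}_0\}$.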

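Let us now consider the problem of maximizing the PageRank
$\pi^T e_{\mathcal{I}}$ when several external outlinks are required. Then the proof of
Theorem 10 can be adapted readily in order to have the following variant of
Theorem 12.

Corollary 18 (Optimal link structure with several external outlinks). Let
$\mathcal{E}_{\mathrm{in}(\mathcal{I})}$ and $\mathcal{E}_{\bar{\mathcal{I}}}$ be given. Let
$\mathcal{E}_{\mathcal{I}}$ and $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ be such that
$\pi^T e_{\mathcal{I}}$ is maximal under Assumption A and the assumption that
$|\mathcal{E}_{\mathrm{out}(\mathcal{I})}| \geq r$. Then there exists a permutation of
the indices such that $\mathcal{I} = \{1, 2, \dots, n_{\mathcal{I}}\}$,
$v_1 > \cdots > v_{n_{\mathcal{I}}} > v_{n_{\mathcal{I}}+1} \geq \cdots \geq v_n$, and
$\mathcal{E}_{\mathcal{I}}$ and $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ have the
following structure:

$$\mathcal{E}_{\mathcal{I}} = \{(i,j) \in \mathcal{I} \times \mathcal{I} : j \leq i \text{ or } j = i+1\},$$
$$\mathcal{E}_{\mathrm{out}(\mathcal{I})} = \{(n_{\mathcal{I}}, j_k) : j_k \in \mathcal{V} \text{ for } k = 1, \dots, r\}.$$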

5.4 External inlinks 

Finally, let us make some comments about the addition of external inlinks to
the set $\mathcal{I}$. It is well known that adding an inlink to a particular page
always increases the PageRank of this page [H [9]. This can be viewed as a direct
consequence of Theorem 5 and Lemma 1. The case of a set of several pages
$\mathcal{I}$ is not so simple. We prove in the following theorem that, if the set
$\mathcal{I}$ has a link structure as described in Theorem 12, then adding an inlink
to a page of $\mathcal{I}$ from a page $j \in \bar{\mathcal{I}}$ which is not a parent
of some node of $\mathcal{I}$ will increase the PageRank of $\mathcal{I}$. But in
general, adding an inlink to some page of $\mathcal{I}$ from $\bar{\mathcal{I}}$ may
decrease the PageRank of the set $\mathcal{I}$, as shown in Examples 7 and 8.
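The single-page statement recalled above is easy to observe on a toy graph. The sketch below is only an illustration: the four-node ring `RING`, the added link $(2,0)$ and the function name `pagerank` are hypothetical choices, and $z$ is taken uniform with $c = 0.85$.

```python
def pagerank(links, n, c=0.85, iters=500):
    """Approximate pi by iterating pi^T <- c pi^T P + (1 - c) z^T, z uniform."""
    out = {i: [j for (a, j) in links if a == i] for i in range(n)}
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [(1 - c) / n] * n
        for i in range(n):
            share = c * pi[i] / len(out[i])
            for j in out[i]:
                new[j] += share
        pi = new
    return pi


RING = {(0, 1), (1, 2), (2, 3), (3, 0)}       # hypothetical graph: a directed ring
before = pagerank(RING, 4)[0]                 # PageRank of page 0 in the ring
after = pagerank(RING | {(2, 0)}, 4)[0]       # same page after adding the inlink (2, 0)
print(before < after)
```

In the ring every page has the same PageRank by symmetry, and the new link from page 2 raises the score of page 0. This concerns the PageRank of the single target page only; it says nothing about the sum over a set of pages.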

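Theorem 19 (External inlinks). Let $\mathcal{I} \subseteq \mathcal{N}$ and a graph
defined by a set of links $\mathcal{E}$. If

$$\min_{i \in \mathcal{I}} v_i > \max_{j \notin \mathcal{I}} v_j,$$

then, for every $j \in \bar{\mathcal{I}}$ which is not a parent of $\mathcal{I}$, and
for every $i \in \mathcal{I}$, the graph defined by
$\tilde{\mathcal{E}} = \mathcal{E} \cup \{(j,i)\}$ gives
$\tilde{\pi}^T e_{\mathcal{I}} > \pi^T e_{\mathcal{I}}$.

Proof. This follows directly from Theorem 5. □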

Example 7. Let us show by an example that a new external inlink is not
always profitable for a set $\mathcal{I}$ in order to improve its PageRank, even if
$\mathcal{I}$ has an optimal linkage strategy. Consider for instance the graph in
Figure 12. With $c = 0.85$ and $z$ uniform, we have
$\pi^T e_{\mathcal{I}} = 0.8481$. But if we consider the graph defined by
$\tilde{\mathcal{E}}_{\mathrm{in}(\mathcal{I})} = \mathcal{E}_{\mathrm{in}(\mathcal{I})} \cup \{(3,2)\}$,
then we have $\tilde{\pi}^T e_{\mathcal{I}} = 0.8321 < \pi^T e_{\mathcal{I}}$.

Figure 12: For $\mathcal{I} = \{1, 2\}$, adding the external inlink $(3,2)$ gives
$\tilde{\pi}^T e_{\mathcal{I}} < \pi^T e_{\mathcal{I}}$ (see Example 7).

Example 8. A new external inlink does not always increase the PageRank
of a set $\mathcal{I}$, even if this new inlink comes from a page which is not already
a parent of some node of $\mathcal{I}$. Consider for instance the graph in
Figure 13. With $c = 0.85$ and $z$ uniform, we have $\pi^T e_{\mathcal{I}} = 0.6$.
But if we consider the graph defined by
$\tilde{\mathcal{E}}_{\mathrm{in}(\mathcal{I})} = \mathcal{E}_{\mathrm{in}(\mathcal{I})} \cup \{(4,3)\}$,
then we have $\tilde{\pi}^T e_{\mathcal{I}} = 0.5897 < \pi^T e_{\mathcal{I}}$.

Figure 13: For $\mathcal{I} = \{1,2,3\}$, adding the external inlink $(4,3)$ gives
$\tilde{\pi}^T e_{\mathcal{I}} < \pi^T e_{\mathcal{I}}$ (see Example 8).



6 Conclusions 

In this paper we provide the general shape of an optimal link structure
for a website in order to maximize its PageRank. This structure, with a
forward chain and every possible backward link, may not be intuitive. To
our knowledge, it has never been mentioned, while topologies like a clique,
a ring or a star are considered in the literature on collusion and alliance
between pages [8]. Moreover, this optimal structure gives new insight
into the affirmation of Bianchini et al. [5] that, in order to maximize the
PageRank of a website, hyperlinks to the rest of the webgraph "should be
in pages with a small PageRank and that have many internal hyperlinks".
More precisely, we have seen that the leaking pages must be chosen with
respect to the mean number of visits before zapping they give to the website,
rather than their PageRank.

Let us now present some possible directions for future work. 

We have noticed in an earlier example that the first node of $\mathcal{I}$ in the
forward chain of an optimal link structure is not necessarily a child of some node
of $\bar{\mathcal{I}}$. In the example we gave, the personalization vector was not
uniform. We wonder if this could occur with a uniform personalization vector and
make the following conjecture.

Conjecture. Let $\mathcal{E}_{\mathrm{in}(\mathcal{I})} \neq \emptyset$ and
$\mathcal{E}_{\bar{\mathcal{I}}}$ be given. Let $\mathcal{E}_{\mathcal{I}}$ and
$\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ be such that $\pi^T e_{\mathcal{I}}$ is maximal
under Assumption A. If $z = \frac{1}{n}\mathbf{1}$, then there exists
$j \in \bar{\mathcal{I}}$ such that $(j,i) \in \mathcal{E}_{\mathrm{in}(\mathcal{I})}$,
where $i \in \operatorname{argmax}_k v_k$.



If this conjecture were true, we could also ask whether the node
$j \in \bar{\mathcal{I}}$ such that $(j,i) \in \mathcal{E}_{\mathrm{in}(\mathcal{I})}$,
where $i \in \operatorname{argmax}_k v_k$, belongs to $\mathcal{V}$.

Another question concerns the optimal linkage strategy in order to maximize
an arbitrary linear combination of the PageRanks of the nodes of $\mathcal{I}$.
In particular, we could want to maximize the PageRank $\pi^T e_{\mathcal{S}}$ of a
target subset $\mathcal{S} \subseteq \mathcal{I}$ by choosing
$\mathcal{E}_{\mathcal{I}}$ and $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ as usual. A
general shape for an optimal link structure seems difficult to find, as shown in
the following example.

Example 9. Consider the graphs in Figure 14. In both cases, let $c = 0.85$
and $z = \frac{1}{n}\mathbf{1}$. Let $\mathcal{I} = \{1,2,3\}$ and let
$\mathcal{S} = \{1,2\}$ be the target set. In the configuration (a), the optimal
sets of links $\mathcal{E}_{\mathcal{I}}$ and $\mathcal{E}_{\mathrm{out}(\mathcal{I})}$
for maximizing $\pi^T e_{\mathcal{S}}$ have the link structure described in
Theorem 12. But in (b), the optimal $\mathcal{E}_{\mathcal{I}}$ and
$\mathcal{E}_{\mathrm{out}(\mathcal{I})}$ do not have this structure. Let us note
nevertheless that, by Theorem 12, the subsets $\mathcal{E}_{\mathcal{S}}$ and
$\mathcal{E}_{\mathrm{out}(\mathcal{S})}$ must have the link structure described in
Theorem 12.

Figure 14: In (a) and (b), bold arrows represent optimal link structures
for $\mathcal{I} = \{1, 2, 3\}$ with respect to a target set $\mathcal{S} = \{1, 2\}$
(see Example 9).